PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation

Liu, Yuxuan

arXiv.org Artificial Intelligence

The alignment of large language models with human values presents a critical challenge, particularly when balancing conflicting objectives like helpfulness and harmlessness. Existing approaches, such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO), face notable limitations: RLHF suffers from instability and inefficiency in multi-objective optimization, while DPO lacks mechanisms for dynamic trade-offs. To address these challenges, we propose Post-Training Extrapolation Optimization (PEO), a novel and efficient framework for bi-factorial alignment. PEO generates a family of Pareto-optimal policies in a single training pass by leveraging a three-phase pipeline: (1) aspect-specific learning, (2) generalist initialization via interpolation, and (3) post-training optimization via extrapolation. PEO enables dynamic adaptation to diverse user preferences at inference time without retraining. Our comprehensive experiments across multiple LLMs demonstrate that PEO achieves superior Pareto fronts compared to baselines, offering improved flexibility and computational efficiency. Theoretical analyses further highlight PEO's capacity to overcome optimization bottlenecks, paving the way for scalable, personalized alignment.
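The interpolation and extrapolation phases operate directly on model parameters. As a minimal sketch of the idea (not the authors' implementation; the function name, toy parameter dictionaries, and blend formula are illustrative assumptions), two aspect-specific policies sharing one architecture can be blended parameter-wise, with a coefficient inside [0, 1] giving the generalist initialization and a coefficient beyond 1 giving the extrapolated policy:

```python
def merge_policies(theta_a, theta_b, alpha):
    """Blend two aspect-specific policies parameter-wise:
    theta = alpha * theta_a + (1 - alpha) * theta_b.

    alpha in [0, 1] interpolates between the two policies
    (a generalist initialization); alpha > 1 extrapolates
    past theta_a, pushing further in the direction away
    from theta_b (the post-training extrapolation idea).
    """
    return {
        name: [alpha * a + (1.0 - alpha) * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }

# Toy two-parameter "policies", standing in for models tuned
# separately for helpfulness and harmlessness.
helpful = {"w": [1.0, 2.0]}
harmless = {"w": [0.0, 0.0]}

generalist = merge_policies(helpful, harmless, 0.5)
extrapolated = merge_policies(helpful, harmless, 1.5)
```

Because alpha is just a scalar applied at merge time, a family of trade-off policies can be produced from the same pair of checkpoints without retraining, which is what enables inference-time adaptation to user preferences.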


Measuring Social Norms of Large Language Models

Yuan, Ye, Tang, Kexin, Shen, Jianhao, Zhang, Ming, Wang, Chenguang

arXiv.org Artificial Intelligence

We present a new challenge to examine whether large language models understand social norms. In contrast to existing datasets, our dataset requires a fundamental understanding of social norms to solve. Our dataset features the largest set of social norm skills, consisting of 402 skills and 12,383 questions covering a wide range of social norms, from opinions and arguments to culture and laws. We design our dataset according to the K-12 curriculum. This enables a direct comparison of the social understanding of large language models to that of humans, more specifically, elementary students. While prior models achieve nearly random accuracy on our benchmark, recent large language models such as GPT-3.5-Turbo and LLaMA2-Chat improve performance significantly, to only slightly below human level. We then propose a multi-agent framework based on large language models to improve the models' ability to understand social norms. This method further improves large language models, bringing them on par with humans. Given the increasing adoption of large language models in real-world applications, our finding is particularly important and presents a unique direction for future improvements.
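The abstract does not detail the multi-agent framework, but one common pattern such frameworks build on is posing the same question to several independent agents and aggregating their answers by majority vote. A minimal sketch of that aggregation step (purely hypothetical; the agent functions here are toy stand-ins for LLM calls):

```python
from collections import Counter

def multi_agent_answer(question, agents):
    """Ask several agent functions the same question and return
    the majority answer; ties resolve to the answer seen first."""
    answers = [agent(question) for agent in agents]
    return Counter(answers).most_common(1)[0][0]

# Toy agents standing in for separate LLM instances, each mapping
# a question to a multiple-choice answer.
agents = [lambda q: "B", lambda q: "A", lambda q: "B"]
```

Majority voting is only one possible aggregation; frameworks of this kind may also have agents debate or critique each other's answers before settling on one.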


We 'interviewed' Harriet Tubman using AI. It got a little weird.

Washington Post - Technology News

Harriet Tubman didn't give many interviews in her lifetime, and when she did, they were generally conducted by one of her friends, Sarah Hopkins Bradford, a White children's book author in Upstate New York, where Tubman spent the last decades of her life. The results of those interviews were two biographies, published in 1869 and 1886. Though Bradford obviously admired Tubman, the books suffer from her sometimes patronizing attitude toward her subject, her use of racial slurs and her awkward attempts to re-create the speech patterns of a Black woman raised enslaved in Maryland. Some of the long "quotes" from Tubman were completely made up, and it shows. So I was curious to see what would happen recently when I had my own "interview" with Tubman -- using the online educator Khan Academy's new artificial intelligence learning tool Khanmigo, which enables users to have live chats with dozens of simulated historical figures like Abigail Adams, Genghis Khan, Montezuma and Winston Churchill. Would it come off horribly, like a 21st-century minstrelsy?


Multilingual Event Extraction from Historical Newspaper Adverts

Borenstein, Nadav, Perez, Natalia da Silva, Augenstein, Isabelle

arXiv.org Artificial Intelligence

NLP methods can aid historians in analyzing textual materials in greater volumes than manually feasible. Developing such methods poses substantial challenges, though. First, acquiring large, annotated historical datasets is difficult, as only domain experts can reliably label them. Second, most available off-the-shelf NLP models are trained on modern language texts, rendering them significantly less effective when applied to historical corpora. This is particularly problematic for less well-studied tasks, and for languages other than English. This paper addresses these challenges while focusing on the under-explored task of event extraction from a novel domain of historical texts. We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. We find that: 1) even with scarce annotated data, it is possible to achieve surprisingly good results by formulating the problem as an extractive QA task and leveraging existing datasets and models for modern languages; and 2) cross-lingual low-resource learning for historical languages is highly challenging, and machine translation of the historical datasets to the considered target languages is, in practice, often the best-performing solution.
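Framing event extraction as extractive QA means turning each event attribute into a natural-language question over the ad text, so that an off-the-shelf span-prediction model (e.g. a SQuAD-style reader) can answer it. A minimal sketch of that data-preparation step (the question templates and attribute names here are illustrative assumptions, not the paper's actual schema):

```python
def ad_to_qa_examples(ad_text, event_attributes):
    """Cast event extraction as extractive QA: emit one
    (context, question) pair per requested attribute, all
    sharing the ad text as context. Attributes without a
    template are skipped."""
    templates = {
        "name": "What is the name of the enslaved person?",
        "date": "When did the escape take place?",
        "reward": "What reward is offered?",
    }
    return [
        {"context": ad_text, "question": templates[attr]}
        for attr in event_attributes
        if attr in templates
    ]

examples = ad_to_qa_examples(
    "Ran away from the subscriber, a man named Jack; "
    "ten dollars reward will be paid.",
    ["name", "reward"],
)
```

This framing is what lets the authors leverage existing modern-language QA datasets and models despite having scarce annotated historical data: the reader only has to locate answer spans, not learn a task-specific output schema.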